46 research outputs found

    Small variable segments constitute a major type of diversity of bacterial genomes at the species level.

    BACKGROUND: Analysis of large-scale diversity in bacterial genomes has mainly focused on elements such as pathogenicity islands or, more generally, genomic islands. These comprise numerous genes and confer important phenotypes, which are present or absent depending on the strain. We report that, despite this widely accepted notion, most diversity at the species level is composed of much smaller DNA segments, 20 to 500 bp in size, which we call microdiversity. RESULTS: We performed a systematic analysis of the variable segments detected by multiple whole-genome alignments at the DNA level in the three species for which the greatest number of genomes have been sequenced: Escherichia coli, Staphylococcus aureus, and Streptococcus pyogenes. Among the numerous sites of variability, 62 to 73% were loci of microdiversity, many of which were located within genes. They contribute to phenotypic variation, as 3 to 6% of all genes harbor microdiversity, and 1 to 9% of all genes are located downstream from a microdiversity locus. Microdiversity loci are particularly abundant in genes encoding membrane proteins. In-depth analysis of the E. coli alignments shows that most of the diversity does not correspond to known mobile or repeated elements, and it was likely generated by illegitimate recombination. An intriguing class of microdiversity includes small blocks of highly diverged sequences, whose origin is discussed. CONCLUSIONS: This analysis uncovers the importance of this small-sized genome diversity, which we expect to be present in a wide range of bacteria, and possibly also in many eukaryotic genomes.

    FouDanGA: Fouille de données pour l'annotation de génomes d'actinomycètes (Data mining for the annotation of actinomycete genomes)

    Poster presenting the ACI FouDanGA project at the JOBIM 2005 conference (Lyon): one-year progress report. The accumulation of sequences produced by sequencing projects calls for data-mining methods to understand the mechanisms involved in the expression, transmission, and evolution of genes. We are interested in combinatorial and stochastic methods for predicting promoter sequences and other short regulatory sequences in bacteria. Two computational approaches are being developed. The first uses algorithms that search for over-represented words, and then pairs of words, in the regions upstream of orthologous genes from phylogenetically close species. The second is a method for mining genomic data without prior assumptions, in order to bring out DNA subsequences in intergenic regions. The data-mining process consists of specifying second-order hidden Markov models (HMM2), training them, and using them to reveal irregularities in large DNA sequences.

    Co-evolution of segregation guide DNA motifs and the FtsK translocase in bacteria: identification of the atypical Lactococcus lactis KOPS motif

    Bacteria use the global bipolarization of their chromosomes into replichores to control the dynamics and segregation of their genome during the cell cycle. This involves the control of protein activities by recognition of specific short DNA motifs whose orientation along the chromosome is highly skewed. The KOPS motifs act in chromosome segregation by orienting the activity of the FtsK DNA translocase towards the terminal replichore junction. KOPS motifs have been identified in γ-Proteobacteria and in Bacillus subtilis as closely related G-rich octamers. We have identified the KOPS motif of Lactococcus lactis, a model bacterium of the Streptococcaceae family harbouring a compact, low-GC% genome. This motif, 5′-GAAGAAG-3′, was predicted in silico using the occurrence and skew characteristics of known KOPS motifs. We show that it is specifically recognized by L. lactis FtsK in vitro and controls its activity in vivo. L. lactis KOPS is thus an A-rich heptamer motif. Our results show that KOPS-controlled chromosome segregation is conserved in Streptococcaceae but that KOPS may vary substantially in sequence and length between bacterial families. This suggests that FtsK adapts to its host genome by selecting motifs with convenient occurrence frequencies and orientation skews to orient its activity.
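The abstract above refers to the "occurrence and skew characteristics" of a motif. As a rough illustration only, the sketch below counts how often a motif appears on the given strand versus the opposite strand of a sequence and reports the fraction found on the given strand. The function names and the toy sequence are assumptions made for this example; this is not the prediction procedure used in the paper.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def motif_skew(seq: str, motif: str):
    """Return (forward count, reverse count, forward fraction).

    A motif read on the opposite strand shows up in seq as its
    reverse complement, so both are counted on the same string.
    """
    fwd = count_overlapping(seq, motif)
    rev = count_overlapping(seq, reverse_complement(motif))
    total = fwd + rev
    fraction = fwd / total if total else 0.0
    return fwd, rev, fraction


# Toy sequence (hypothetical, not genomic data).
toy = "GAAGAAGAAG" + "TTT" + "CTTCTTC"
fwd, rev, fraction = motif_skew(toy, "GAAGAAG")
print(fwd, rev, round(fraction, 3))
```

On a real chromosome one would typically compute this separately for each replichore, since the abstract describes the orientation bias relative to the replichore structure.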

    SIGffRid: A tool to search for sigma factor binding sites in bacterial genomes using comparative approach and biologically driven statistics

    Background: Many programs have been developed to identify transcription factor binding sites. However, most of them are not able to infer two-word motifs with variable spacer lengths. This case is encountered for RNA polymerase sigma (σ) factor binding sites (SFBSs), which are usually composed of two boxes, called -35 and -10 in reference to the transcription initiation point. Our goal is to design an algorithm that detects SFBSs by using combinatorial and statistical constraints deduced from biological observations. Results: We describe a new approach to identify SFBSs by comparing two related bacterial genomes. The method, named SIGffRid (SIGma Factor binding sites Finder using R'MES to select Input Data), performs a simultaneous analysis of pairs of promoter regions of orthologous genes. SIGffRid uses a prior identification of over-represented patterns in whole genomes as the selection criterion for potential -35 and -10 boxes. These patterns are then grouped using pairs of short seeds (one of which may be gapped), allowing a variable-length spacer between them. Next, the motifs are extended under the guidance of statistical considerations, a feature that ensures the selection of motifs with statistically relevant properties. We applied our method to the pair of related bacterial genomes of Streptomyces coelicolor and Streptomyces avermitilis. A cross-check with the well-defined SFBSs of the SigR regulon in S. coelicolor is detailed, validating the algorithm. SFBSs for HrdB and BldN were also found, and the results suggested some new targets for these σ factors. In addition, consensus motifs for BldD and new SFBSs were defined, overlapping previously proposed consensuses. Relevant tests were also carried out on bacteria with moderate GC content (i.e. the Escherichia coli/Salmonella typhimurium and Bacillus subtilis/Bacillus licheniformis pairs). Motifs of housekeeping σ factors were found, as well as other SFBSs such as that of SigW in Bacillus strains. Conclusion: We demonstrate that our approach, combining statistical and biological criteria, successfully predicts SFBSs. The versatility of the method allows the recognition of other kinds of two-box regulatory sites.
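To make the notion of a "two-word motif with a variable-length spacer" concrete, here is a minimal sketch that scans a sequence for a -35 box and a -10 box separated by a spacer whose length falls within a given range. It is a plain regular-expression search, not SIGffRid's algorithm, which selects and extends motifs statistically. The default boxes are the textbook E. coli σ70 consensus hexamers, and the spacer bounds and function name are assumptions chosen for illustration.

```python
import re


def find_two_box_sites(seq, box35="TTGACA", box10="TATAAT",
                       min_spacer=15, max_spacer=19):
    """Return (start, spacer_length) for each two-box hit in seq.

    The lookahead lets hits that overlap each other be reported;
    the non-greedy spacer keeps the shortest valid spacer per start.
    """
    pattern = re.compile(
        rf"(?=({box35})([ACGT]{{{min_spacer},{max_spacer}}}?)({box10}))"
    )
    return [(m.start(), len(m.group(2))) for m in pattern.finditer(seq)]


# Toy promoter-like sequence (hypothetical) with a 17-nt spacer.
toy = "CC" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
print(find_two_box_sites(toy))
```

Exact matching is the simplest possible choice here; real binding sites are degenerate, which is why approaches like the one described above rely on over-representation statistics rather than fixed strings.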

    Recherche des sites de régulation de la transcription dans des génomes bactériens (Searching for transcriptional regulatory sites in bacterial genomes)

    No full text
    Full text accessible only to members of the Université de Lorraine. Many programs have been developed to identify transcription factor binding sites. Most of them cannot infer two-word motifs with variable spacer lengths, which are characteristic of RNA polymerase sigma (σ) factor binding sites (SFBSs). The aim of this thesis is to design an algorithm that takes into account the structural biological observations about these sites in order to predict them reliably. We describe a new approach, SIGffRid (SIGma Factor binding sites Finder using R'MES to select Input Data), to identify SFBSs by comparing two related bacterial genomes. The method performs a simultaneous analysis of pairs of promoter regions of orthologous genes. SIGffRid uses a prior identification of over-represented patterns in whole genomes as the selection criterion for potential -35 and -10 boxes. These patterns are then grouped using pairs of short seeds, allowing a variable-length spacer between them. This is followed by motif extension guided by statistical considerations. Finally, statistically feasible and relevant motifs are selected. We applied our method to the pair of related bacterial genomes of Streptomyces coelicolor A3(2) and Streptomyces avermitilis. We demonstrate that our approach, combining statistical and biological criteria, successfully predicts SFBSs, and we discuss planned improvements.

    Recherche des sites de régulation de la transcription chez Streptomyces coelicolor A3(2) (Searching for transcriptional regulatory sites in Streptomyces coelicolor A3(2))

    No full text
    DEA internship report. Search for transcriptional regulatory sites by comparing the upstream regions of potentially orthologous genes across all the species compared (2 to 4).

    DNA motifs that sculpt the bacterial chromosome

    No full text
    During the bacterial cell cycle, the processes of chromosome replication, DNA segregation, DNA repair and cell division are coordinated by precisely defined events. Tremendous progress has been made in recent years in identifying the mechanisms that underlie these processes. A striking feature common to these processes is that non-coding DNA motifs play a central part, thus 'sculpting' the bacterial chromosome. Here, we review the roles of these motifs in the mechanisms that ensure faithful transmission of genetic information to daughter cells. We show how their chromosomal distribution is crucial for their function and how it can be analysed quantitatively. Finally, the potential roles of these motifs in bacterial chromosome evolution are discussed